
Conversation

@dxqb (Collaborator) commented Jan 16, 2026

git fetch origin pull/1261/head:pr-1261
git switch pr-1261

Then run update.sh (or update.bat on Windows).

  • requires testing
  • requires experimentation with quantization and training parameters; the preset is very preliminary
  • might break Qwen-Image: this PR updates diffusers to a commit that includes Flux2-Klein, but that commit also fixes some bugs in their Qwen-Image pipeline, so the workarounds for those bugs have to be removed from OneTrainer
  • there is an issue with masked training in this branch
  • rename "Safetensors" output format

Includes #1237.
See e50970f for only the code changes introduced by this PR.

@yamatazen

No 4B model?

@dxqb (Collaborator, Author) commented Jan 17, 2026

> No 4B model?

It works with both the 9B and the 4B model.

@yamatazen

Where is the 4B preset?

@dxqb (Collaborator, Author) commented Jan 17, 2026

> Where is the 4B preset?

There is none; just replace the base model name.
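For anyone unsure what that means in practice: the base model field takes a Hugging Face repo id. Below is a minimal sketch for sanity-checking the id before starting a run; the 4B repo id is an assumption by analogy with the 9B id that appears later in this thread, so verify it on the Hub first.

```python
# Sketch: check that the repo id you plan to paste into OneTrainer's base model
# field resolves on the Hugging Face Hub. The 4B id is an ASSUMPTION, formed by
# analogy with "black-forest-labs/FLUX.2-klein-base-9B" from the logs below.
from huggingface_hub import model_info

repo_id = "black-forest-labs/FLUX.2-klein-base-4B"  # assumed name, verify on the Hub
info = model_info(repo_id)
print(info.id)
print([s.rfilename for s in info.siblings][:10])  # expect a diffusers layout (transformer/, text_encoder/, ...)
```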

@pyros-projects

Three test runs so far and everything works fine. I did not test Qwen, though. Will report in detail later with examples.

Is there a way to do the in-training sampling with a different model than the one defined under "model"? As in, I train on the base model, but sampling happens with the distilled version.
That would also be handy for Z-Image.

If not, is this something OneTrainer should be able to do?

@dxqb (Collaborator, Author) commented Jan 18, 2026

> Three test runs so far and everything works fine. I did not test Qwen, though. Will report in detail later with examples.
>
> Is there a way to do the in-training sampling with a different model than the one defined under "model"? As in, I train on the base model, but sampling happens with the distilled version. That would also be handy for Z-Image.
>
> If not, is this something OneTrainer should be able to do?

Keeping an entire second model in RAM isn't feasible for most people's hardware and for most models. It might be for your hardware and for the smallest models like Flux2-Klein, but not generally.

I'd rather have LoRA adapters that can be enabled during sampling or training, as has been done for Z-Image. The issue there is that LoRA formats are inconsistent, and it's a can of worms to try to load LoRAs created by other tools.
But you could open a feature request issue on GitHub if you want that.
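If the distilled variant were available as a LoRA on top of the base model, the idea would look roughly like the sketch below in plain diffusers. This is not how OneTrainer samples internally; it assumes the Flux2 pipeline exposes the usual diffusers LoRA loader methods, and the adapter path is a placeholder.

```python
# Rough sketch of "train on the base model, sample with an adapter enabled",
# outside OneTrainer, using plain diffusers. Assumes the Flux2 pipeline supports
# load_lora_weights()/unload_lora_weights(); the adapter file is hypothetical.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-base-9B",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Enable a hypothetical distillation/acceleration adapter only for sampling.
pipe.load_lora_weights("path/to/distill_lora.safetensors")
image = pipe("a test prompt", num_inference_steps=8).images[0]

# Remove the adapter again so the bare base model is used afterwards.
pipe.unload_lora_weights()
```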

@nphSi commented Jan 18, 2026

Hey, just a quick report: I trained two LoRAs with my (as simple as possible) settings that I use for SDXL and Z-Image, and it worked surprisingly well on the first try. Masked training gave errors and my LR was a bit too high, but the results match and maybe beat Z-Image.
Tested on a 4060 Ti 8 GB at 512px. See my results here.
Settings: RMSprop, cosine scheduler, LR 5e-4, EMA, ~2400 steps, warmup 0.1, nbias 0.4, batch size 1.

Thanks for all your efforts!

@Raphael023

How do I train an edit LoRA with target images and control images?

@Frostedbyte404

Fetching 6 files: 100%|██████████████████████████████████████████████████████████████████████████| 6/6 [00:00<?, ?it/s]
flux-2-klein-base-9b-Q6_K.gguf: 100%|█████████████████████████████████████████████| 7.87G/7.87G [03:07<00:00, 41.9MB/s]
Traceback (most recent call last):
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 180, in load
    self.__load_internal(
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 53, in __load_internal
    raise Exception("not an internal model")
Exception: not an internal model

Traceback (most recent call last):
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 188, in load
    self.__load_diffusers(
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 79, in __load_diffusers
    transformer = Flux2Transformer2DModel.from_single_file(
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\venv\Lib\site-packages\huggingface_hub\utils\_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\venv\src\diffusers\src\diffusers\loaders\single_file_model.py", line 491, in from_single_file
    load_model_dict_into_meta(
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\venv\src\diffusers\src\diffusers\models\model_loading_utils.py", line 291, in load_model_dict_into_meta
    hf_quantizer.check_quantized_param_shape(param_name, empty_state_dict[param_name], param)
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\venv\src\diffusers\src\diffusers\quantizers\gguf\gguf_quantizer.py", line 84, in check_quantized_param_shape
    raise ValueError(
ValueError: double_stream_modulation_img.linear.weight has an expected quantized shape of: (24576, 4096), but received shape: torch.Size([24576, 8192])

Traceback (most recent call last):
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 196, in load
    self.__load_safetensors(
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 167, in __load_safetensors
    raise NotImplementedError("Loading of single file Flux2 models not supported. Use the diffusers model instead. Optionally, transformer-only safetensor files can be loaded by overriding the transformer.")
NotImplementedError: Loading of single file Flux2 models not supported. Use the diffusers model instead. Optionally, transformer-only safetensor files can be loaded by overriding the transformer.

Traceback (most recent call last):
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\ui\TrainUI.py", line 750, in __training_thread_function
    trainer.start()
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\trainer\GenericTrainer.py", line 129, in start
    self.model = self.model_loader.load(
                 ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\GenericLoRAModelLoader.py", line 51, in load
    base_model_loader.load(model, model_type, model_names, weight_dtypes, quantization)
  File "C:\Users\Admin\Documents\LLM\OneTrainer_f\modules\modelLoader\Flux2ModelLoader.py", line 205, in load
    raise Exception("could not load model: " + model_names.base_model)
Exception: could not load model: black-forest-labs/FLUX.2-klein-base-9B

Flux 2 Klein doesn't support override?

@dxqb (Collaborator, Author) commented Jan 18, 2026

> Flux 2 Klein doesn't support override?

The shapes don't match, which is usually a sign that you've used a GGUF file but haven't selected a GGUF dtype.
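If it helps with debugging, the .gguf file itself can be inspected to see which quantization type it stores. A minimal sketch using the gguf package from the llama.cpp project; the file name matches the one in the log above.

```python
# Sketch: list a few tensors from the GGUF file with their logical shapes,
# quantization types, and packed storage shapes. The packed storage does not
# match the fp16 layout, which is consistent with the diagnosis above: a GGUF
# weight dtype has to be selected so the loader dequantizes the tensors.
from gguf import GGUFReader

reader = GGUFReader("flux-2-klein-base-9b-Q6_K.gguf")
for t in reader.tensors[:5]:
    print(t.name, t.tensor_type.name, tuple(t.shape), t.data.shape)
```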

@dxqb (Collaborator, Author) commented Jan 18, 2026

> How do I train an edit LoRA with target images and control images?

Not supported, mainly because we lack the UI and dataloader support in OneTrainer overall. Supporting Flux2-Klein edit training is easy once that exists.

@Frostedbyte404 commented Jan 18, 2026

> Flux 2 Klein doesn't support override?
>
> The shapes don't match, which is usually a sign that you've used a GGUF file but haven't selected a GGUF dtype.

That's strange. I selected the dtype as GGUF, but it keeps giving me this error. I'm racking my brain trying to figure it out.
[screenshot of the model settings]

Note: I entered an invalid token for security reasons.
